{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1fe6e643-9453-4381-9445-bd471685fb96",
   "metadata": {},
   "source": [
    "## Exploring the SciQ dataset using Autolabel"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80110a5b-2b3e-45e2-a2da-f6fa00200dff",
   "metadata": {},
   "source": [
    "#### Setup the API Keys for providers that you want to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "92993c83-4473-4e05-9510-f543b070c7d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# provide your own OpenAI API key here\n",
    "os.environ['OPENAI_API_KEY'] = 'sk-'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c246f85",
   "metadata": {},
   "source": [
    "#### Install the autolabel library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc181e31",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install 'refuel-autolabel[openai]'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0ced632",
   "metadata": {},
   "source": [
    "#### Download the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "1bdeefe1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from autolabel import get_data\n",
    "\n",
    "get_data('multimodal_science_qa')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16f17787",
   "metadata": {},
   "source": [
    "This downloads two datasets:\n",
    "* `test.csv`: This is the larger dataset we are trying to label using LLMs\n",
    "* `seed.csv`: This is a small dataset where we already have human-provided labels"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84b014d1-f45c-4479-9acc-0d20870b1786",
   "metadata": {},
   "source": [
    "## Start the labeling process!\n",
    "\n",
    "Labeling with Autolabel is a 3-step process:\n",
    "* First, we specify a labeling configuration (see `config.json` below)\n",
    "* Next, we do a dry-run on our dataset using the LLM specified in `config.json` by running `agent.plan`\n",
    "* Finally, we run the labeling with `agent.run`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "c093fe91-3508-4140-8bd6-217034e3cce6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "from autolabel import LabelingAgent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "c93fae0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load the config\n",
    "with open('config_multimodal_sciq.json', 'r') as f:\n",
    "     config = json.load(f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0983618",
   "metadata": {},
   "source": [
    "Let's review the configuration file below. You'll notice the following useful keys:\n",
    "* `task_type`: `question_answering` (since it's a question answering task)\n",
    "* `model`: `{'provider': 'openai', 'name': 'gpt-3.5-turbo'}` (use a specific OpenAI model)\n",
    "* `prompt.task_guidelines`: `'You are an expert at answer science questions...` (how we describe the task to the LLM)\n",
    "* `prompt.few_shot_num`: 10 (how many labeled examples to provide to the LLM)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "08a1f895",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'task_name': 'ScienceQuestionAnswering',\n",
       " 'task_type': 'question_answering',\n",
       " 'dataset': {'label_column': 'answer',\n",
       "  'delimiter': ',',\n",
       "  'image_url_column': 'image_url'},\n",
       " 'model': {'provider': 'openai_vision', 'name': 'gpt-4-vision-preview'},\n",
       " 'prompt': {'task_guidelines': \"You are an expert at answer science questions. Your job is to answer the given question, using the options provided for each question. You'll also be given an image for each question - use that as context as needed. Choose the best answer for the question from among the options provided. Output just the answer (from the given options) and nothing else.\",\n",
       "  'example_template': 'Question: {question}\\nOptions: {choices}\\nAnswer: {answer}'}}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "config"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "acb4a3de-fa84-4b94-b17a-7a6fac892a1d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create an agent for labeling\n",
    "agent = LabelingAgent(config=config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "92667a39",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "c68b3578bc5f4719acf7aa2ca2e3787e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┌──────────────────────────┬─────────┐\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Total Estimated Cost     </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> $3.6768 </span>│\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Number of Examples       </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> 200     </span>│\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Average cost per example </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> $0.0184 </span>│\n",
       "└──────────────────────────┴─────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┌──────────────────────────┬─────────┐\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mTotal Estimated Cost    \u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m$3.6768\u001b[0m\u001b[1;32m \u001b[0m│\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mNumber of Examples      \u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m200    \u001b[0m\u001b[1;32m \u001b[0m│\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mAverage cost per example\u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m$0.0184\u001b[0m\u001b[1;32m \u001b[0m│\n",
       "└──────────────────────────┴─────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #00ff00; text-decoration-color: #00ff00\">───────────────────────────────────────────────── </span>Prompt Example<span style=\"color: #00ff00; text-decoration-color: #00ff00\"> ──────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[92m───────────────────────────────────────────────── \u001b[0mPrompt Example\u001b[92m ──────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span><span style=\"color: #008000; text-decoration-color: #008000\">\"text\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"You are an expert at answer science questions. Your job is to answer the given question, using the </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">options provided for each question. You'll also be given an image for each question - use that as context as </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">needed. Choose the best answer for the question from among the options provided. Output just the answer (from the </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">given options) and nothing else.\\n\\nYou will return the answer one element: \\\"the correct label\\\"\\n\\n\\nNow I want </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">you to label the following example:\\nQuestion: Which of the following could Gordon's test show?\\nOptions: ['if the </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">spacecraft was damaged when using a parachute with a 1 m vent going 200 km per hour', 'how steady a parachute with </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">a 1 m vent was at 200 km per hour', 'whether a parachute with a 1 m vent would swing too much at 400 km per </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">hour']\\nAnswer: \"</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"image_url\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"https://autolabel-benchmarking.s3.amazonaws.com/multimodal_science_qa/200.jpg\"</span><span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\u001b[32m\"text\"\u001b[0m: \u001b[32m\"You are an expert at answer science questions. Your job is to answer the given question, using the \u001b[0m\n",
       "\u001b[32moptions provided for each question. You'll also be given an image for each question - use that as context as \u001b[0m\n",
       "\u001b[32mneeded. Choose the best answer for the question from among the options provided. Output just the answer \u001b[0m\u001b[32m(\u001b[0m\u001b[32mfrom the \u001b[0m\n",
       "\u001b[32mgiven options\u001b[0m\u001b[32m)\u001b[0m\u001b[32m and nothing else.\\n\\nYou will return the answer one element: \\\"the correct label\\\"\\n\\n\\nNow I want \u001b[0m\n",
       "\u001b[32myou to label the following example:\\nQuestion: Which of the following could Gordon's test show?\\nOptions: \u001b[0m\u001b[32m[\u001b[0m\u001b[32m'if the \u001b[0m\n",
       "\u001b[32mspacecraft was damaged when using a parachute with a 1 m vent going 200 km per hour', 'how steady a parachute with \u001b[0m\n",
       "\u001b[32ma 1 m vent was at 200 km per hour', 'whether a parachute with a 1 m vent would swing too much at 400 km per \u001b[0m\n",
       "\u001b[32mhour'\u001b[0m\u001b[32m]\u001b[0m\u001b[32m\\nAnswer: \"\u001b[0m, \u001b[32m\"image_url\"\u001b[0m: \u001b[32m\"https://autolabel-benchmarking.s3.amazonaws.com/multimodal_science_qa/200.jpg\"\u001b[0m\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #00ff00; text-decoration-color: #00ff00\">───────────────────────────────────────────────────────────────────────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[92m───────────────────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from autolabel import AutolabelDataset\n",
    "ds = AutolabelDataset(\"data/multimodal_science_qa/test.csv\", config=config)\n",
    "agent.plan(ds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "dd703025-54d8-4349-b0d6-736d2380e966",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "df103fa1aff64b25bc96d5ac0a87b8c0",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Actual Cost: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.0964</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Actual Cost: \u001b[1;36m0.0964\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┓\n",
       "┃<span style=\"font-weight: bold\"> accuracy </span>┃<span style=\"font-weight: bold\"> support </span>┃<span style=\"font-weight: bold\"> completion_rate </span>┃<span style=\"font-weight: bold\"> f1     </span>┃\n",
       "┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━┩\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 0.8      </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 10      </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0             </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 0.8667 </span>│\n",
       "└──────────┴─────────┴─────────────────┴────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┓\n",
       "┃\u001b[1m \u001b[0m\u001b[1maccuracy\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1msupport\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mcompletion_rate\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mf1    \u001b[0m\u001b[1m \u001b[0m┃\n",
       "┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━┩\n",
       "│\u001b[1;36m \u001b[0m\u001b[1;36m0.8     \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m10     \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0            \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m0.8667\u001b[0m\u001b[1;36m \u001b[0m│\n",
       "└──────────┴─────────┴─────────────────┴────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "ds = agent.run(ds, max_items=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9e38a35a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.8 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
